Randomized Spectral Clustering in Large-Scale Stochastic Block Models
Abstract
Spectral clustering has been one of the most widely used methods for community detection in networks. However, large-scale networks pose a computational challenge for it. In this paper, we study spectral clustering using randomized sketching algorithms from a statistical perspective, where we typically assume the network data are generated from a stochastic block model. To do this, we first use recently developed sketching algorithms to derive two randomized spectral clustering algorithms, namely, random projection-based and random sampling-based spectral clustering. Then we study the theoretical bounds of the resulting algorithms in terms of the approximation error for the population adjacency matrix, the misclustering error, and the estimation error for the link probability matrix. It turns out that, under mild conditions, the randomized algorithms perform similarly to the original one. We also conduct numerical experiments to support the theoretical findings.
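As a rough illustration of the random projection-based idea mentioned in the abstract, the following Python sketch clusters a small simulated stochastic block model. It is not the paper's algorithm, whose details are not given here: it assumes the generic Gaussian range-finder scheme with a few power iterations, followed by a plain Lloyd k-means. All function names, parameters, and the simulation settings (n = 300, K = 3, p_in = 0.5, p_out = 0.05) are hypothetical choices made for this example.

```python
from itertools import permutations

import numpy as np


def sample_sbm(n, K, p_in, p_out, rng):
    """Draw a symmetric 0/1 adjacency matrix from a balanced K-block SBM."""
    labels = np.repeat(np.arange(K), n // K)
    P = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random((n, n)) < P, k=1)  # independent edges, no self-loops
    A = (upper | upper.T).astype(float)
    return A, labels


def randomized_eigvecs(A, K, oversample=10, power_iters=2, rng=None):
    """Approximate the K leading eigenvectors (by magnitude) of symmetric A
    using a Gaussian random projection (assumed scheme, see lead-in)."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]
    Omega = rng.standard_normal((n, K + oversample))  # random test matrix
    Y = A @ Omega
    for _ in range(power_iters):  # power iterations sharpen the spectrum
        Q, _ = np.linalg.qr(Y)
        Y = A @ Q
    Q, _ = np.linalg.qr(Y)        # orthonormal basis for the sketched range
    B = Q.T @ A @ Q               # small (K + oversample)-dimensional problem
    w, V = np.linalg.eigh(B)
    top = np.argsort(-np.abs(w))[:K]
    return Q @ V[:, top]


def kmeans(X, K, iters=50):
    """Plain Lloyd iterations with farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, K):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for k in range(K):
            if np.any(assign == k):
                centers[k] = X[assign == k].mean(axis=0)
    return assign


def misclustering_rate(true, pred, K):
    """Fraction of mislabeled nodes, minimized over label permutations."""
    return min(np.mean(np.array(perm)[pred] != true)
               for perm in permutations(range(K)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, labels = sample_sbm(300, 3, 0.5, 0.05, rng)
    U = randomized_eigvecs(A, 3, rng=rng)
    pred = kmeans(U, 3)
    print("misclustering rate:", misclustering_rate(labels, pred, 3))
```

The expensive step of ordinary spectral clustering, a full eigendecomposition of the n x n adjacency matrix, is replaced here by a few matrix products with a thin random matrix and an eigendecomposition of a small (K + oversample)-dimensional matrix. A sampling-based variant would instead sparsify A before the eigenvector step; it is not shown.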
Similar resources
Data clustering using stochastic block models
It has been shown that community detection algorithms work better for clustering tasks than other, more popular methods, such as k-means. In fact, network analysis based methods often outperform more widely used methods and do not suffer from some of the drawbacks we notice elsewhere e.g. the number of clusters k usually has to be known in advance. However, stochastic block models which are kno...
Large-Scale Spectral Clustering on Graphs
Graph clustering has received growing attention in recent years as an important analytical technique, both due to the prevalence of graph data, and the usefulness of graph structures for exploiting intrinsic data characteristics. However, as graph data grows in scale, it becomes increasingly more challenging to identify clusters. In this paper we propose an efficient clustering algorithm for la...
Preconditioned Spectral Clustering for Stochastic Block Partition Streaming Graph Challenge
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is demonstrated to efficiently solve eigenvalue problems for graph Laplacians that appear in spectral clustering. For static graph partitioning, 10–20 iterations of LOBPCG without preconditioning result in ~10x error reduction, enough to achieve 100% correctness for all Challenge datasets with known truth partitions, e.g., for gra...
Spectral clustering and the high-dimensional Stochastic Block Model
Networks or graphs can easily represent a diverse set of data sources that are characterized by interacting units or actors. Social networks, representing people who communicate with each other, are one example. Communities or clusters of highly connected actors form an essential feature in the structure of several empirical networks. Spectral clustering is a popular and computationally feasibl...
Large-Scale Clustering in Bubble Models
We analyze the statistical properties of bubble models for the large-scale distribution of galaxies. To this aim, we realize static simulations, in which galaxies are mostly randomly arranged in the regions surrounding bubbles. As a first test, we realize simulations of the Lick map, by suitably projecting the three-dimensional simulations. In this way, we are able to safely compare the angular...
Journal
Journal title: Journal of Computational and Graphical Statistics
Year: 2022
ISSN: 1061-8600, 1537-2715
DOI: https://doi.org/10.1080/10618600.2022.2034636